LELIO: An Auto-Adaptative System to Acquire Domain Lexical Knowledge in Technical Texts

Author

  • Patrick Saint-Dizier
Abstract

In this paper, we investigate some language acquisition facets of an auto-adaptative system that can automatically acquire most of the relevant lexical knowledge and authoring practices for an application in a given domain. This is the LELIO project, which produces customized LELIE solutions. LELIE is a system that tags language uses that do not follow the Constrained Natural Language principles; our goal is to automate its long, costly and error-prone lexical customization to a given application domain. Technical texts are relatively restricted in terms of syntax and lexicon, and the results obtained show that this approach is feasible and relatively reliable. By auto-adaptative, we mean that the system learns, from a sample of the application corpus, the various lexical terms and uses that are crucial for LELIE to work properly (e.g. verb uses, fuzzy terms, business terms, stylistic patterns). A technical-writer validation method is developed at each step of the acquisition.

Similar resources

Other-Anaphora Resolution in Biomedical Texts with Automatically Mined Patterns

This paper proposes an other-anaphora resolution approach in bio-medical texts. It utilizes automatically mined patterns to discover the semantic relation between an anaphor and a candidate antecedent. The knowledge from lexical patterns is incorporated in a machine learning framework to perform anaphora resolution. The experiments show that the machine learning approach combined with the auto-mine...

The Deep Lexical Semantics of Event Words

But doing this requires fairly complex inference, because the words “block”, “enter”, “can”, “not” and “deliver” carve up the world in different ways. Words describe the world, so if we are going to draw the appropriate inferences in understanding a text, we must have underlying theories of aspects of the world and we must have axioms that link these to words. This includes domain-dependent kn...

A Correlational Study of Expectancy Grammar’s Manifestation on Cloze Test and Lexical Collocational Density

The notion of expectancy grammar as a key to understanding the nature of psychologically real processes that underlie language use is introduced by Oller (1979). A central issue in this notion is that expectancy generating systems are constructed and modified in the course of language acquisition. Thus, one of the characteristics of language proficiency is that it consists of such an expectancy...

Automatically Augmenting Terminological Lexicons from Untagged Text

Lexical resources play a crucial role in language technology but lexical acquisition can often be a time-consuming, laborious and costly exercise. In this paper, we describe a method for the automatic acquisition of technical terminology from domain restricted texts without the need for sophisticated natural language processing tools, such as taggers or parsers, or text corpora annotated with l...

A corpus-based approach to Information Extraction

This paper presents an Information Extraction (IE) system. This kind of system is intended to extract structured information from general texts. An evaluation is performed and the results are discussed. We show that, if IE is now an established technology, it suffers a number of limitations that prevent its dissemination through general public applications. To get over this obstacle, systems ha...


Publication date: 2016